Data Augmentation Using Multi-Input Multi-Output Source Separation for Deep Neural Network Based Acoustic Modeling

Authors

Yusuke Fujita

Ryoichi Takashima

Takeshi Homma

Masahito Togami

Abstract

We investigate the use of local Gaussian modeling (LGM) based source separation to improve speech recognition accuracy. Previous studies have successfully applied LGM-based source separation both to runtime speech enhancement and to the enhancement of training data for deep neural network (DNN) based acoustic modeling. In this paper, we propose a data augmentation method that exploits the multi-input multi-output (MIMO) characteristic of LGM-based source separation. We first compare unprocessed multi-microphone signals with the multi-channel output signals of LGM-based source separation as augmented training data for DNN-based acoustic modeling. Experimental results on the third CHiME challenge dataset show that the proposed data augmentation outperforms conventional data augmentation. In addition, we experiment with beamforming applied to the source-separated signals as runtime speech enhancement. The results show that the proposed runtime beamforming further improves speech recognition accuracy.
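The general idea of multi-channel data augmentation can be illustrated with a minimal sketch: each channel of a multi-channel signal (either the unprocessed microphone signals or the multi-channel output of a separation front end) is treated as its own single-channel training utterance sharing one transcript. The function name `augment_by_channel`, the utterance ID scheme, and the toy signal values are assumptions made for illustration; this is not the authors' implementation, and no LGM separation is performed here.

```python
from typing import List, Sequence, Tuple

Signal = Sequence[float]
Example = Tuple[str, Signal, str]  # (example id, waveform, transcript)


def augment_by_channel(
    utt_id: str, channels: Sequence[Signal], transcript: str
) -> List[Example]:
    """Turn one multi-channel utterance into one training example per channel.

    Hypothetical helper: every channel keeps the same transcript, so an
    N-channel utterance contributes N examples to the training set.
    """
    return [
        (f"{utt_id}_ch{idx + 1}", channel, transcript)
        for idx, channel in enumerate(channels)
    ]


# Toy stand-in for a 3-channel output of a separation front end (assumed values).
separated = [[0.1, 0.2, 0.3], [0.0, 0.1, 0.2], [0.2, 0.1, 0.0]]
examples = augment_by_channel("utt001", separated, "hello world")
```

With three channels, the single utterance yields three training examples, which is the sense in which multi-channel signals enlarge the training set.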

Similar resources

Modeling heat transfer of non-Newtonian nanofluids using hybrid ANN-Metaheuristic optimization algorithm

An optimal artificial neural network (ANN) has been developed to predict the Nusselt number of non-Newtonian nanofluids. The resulting ANN is a multi-layer perceptron with two hidden layers consisting of six and nine neurons, respectively. The tangent sigmoid transfer function is the best for both hidden layers and the linear transfer function is the best transfer function for the output layer....

Development of an in-cylinder processes model of a CVVT gasoline engine using artificial neural network

Model-based design is receiving increasing attention in powertrain development. Models that are precise yet fast to run are required for applying model-based techniques in powertrain control design and engine calibration. In this paper, an in-cylinder process model of a CVVT gasoline engine is developed to be employed in extended mean valve control oriented model and also mod...

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

Rejection of the Feed-Flow Disturbances in a Multi-Component Distillation Column Using a Multiple Neural Network Model-Predictive Controller

This article deals with the issues associated with developing a new design methodology for the nonlinear model-predictive control (MPC) of a chemical plant. A combination of multiple neural networks is selected and used to model a nonlinear multi-input multi-output (MIMO) process with time delays. An optimization procedure for a neural MPC algorithm based on this model is then developed. T...

Acoustic scene classification using convolutional neural network and multiple-width frequency-delta data augmentation

In recent years, neural network approaches have shown superior performance to conventional hand-made features in numerous application areas. In particular, convolutional neural networks (ConvNets) exploit spatially local correlations across input data to improve the performance of audio processing tasks, such as speech recognition, musical chord recognition, and onset detection. Here we apply C...

Publication date: 2016
